AITopics | update policy

update policy

28ce9bc954876829eeb56ff46da8e1ab-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 05:11:59 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Sports > Tennis (0.47)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Differentially Private n-gram Extraction

Neural Information Processing Systems | Feb-7-2026, 23:33:23 GMT

Then, 'great tennis player' is a 3-gram as it appears in the text of

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.04)

Industry:

Leisure & Entertainment > Sports > Tennis (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Multi-Agent First Order Constrained Optimization in Policy Space

Neural Information Processing Systems | Dec-26-2025, 05:06:03 GMT

In multi-agent reinforcement learning (MARL), achieving high performance is crucial for a successful multi-agent system. Meanwhile, the ability to avoid unsafe actions is becoming an urgent problem to solve for real-life applications. However, it remains challenging to develop a safety-aware method for multi-agent systems in MARL. In this work, we introduce a novel approach called Multi-Agent First Order Constrained Optimization in Policy Space (MAFOCOPS), which addresses the dual objectives of attaining satisfactory performance and enforcing safety constraints. Using data generated from the current policy, MAFOCOPS first finds the optimal update policy by solving a constrained optimization problem in the nonparameterized policy space. Then, the update policy is projected back into the parametric policy space to achieve a feasible policy. Notably, our method is first-order in nature, which makes it easy to implement, and exhibits an approximate upper bound on the worst-case constraint violation. Empirical results show that our approach achieves remarkable performance while satisfying safety constraints on several safe MARL benchmarks.
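
As a rough illustration of the two steps described above, here is the form they take in single-agent FOCOPS, the method whose name MAFOCOPS extends. This is a sketch under stated assumptions and is not taken from the abstract: the multi-agent formulation in the paper may differ, and every symbol is introduced here for illustration ($\pi_{\theta_k}$ is the current policy, $A$ and $A_C$ are its reward and cost advantages, $J_C$ is the expected cost, $b$ the cost limit, $\delta$ the trust-region size, $\lambda$ and $\nu$ the dual variables, $Z_{\lambda,\nu}$ a normalizer).

Step 1, the constrained problem in the nonparameterized policy space:

$$
\max_{\pi}\; \mathbb{E}_{s \sim d^{\pi_{\theta_k}},\, a \sim \pi}\big[A^{\pi_{\theta_k}}(s,a)\big]
\quad \text{s.t.} \quad
J_C(\pi_{\theta_k}) + \frac{1}{1-\gamma}\, \mathbb{E}_{s \sim d^{\pi_{\theta_k}},\, a \sim \pi}\big[A_C^{\pi_{\theta_k}}(s,a)\big] \le b,
\qquad
\bar{D}_{\mathrm{KL}}(\pi \,\|\, \pi_{\theta_k}) \le \delta
$$

whose solution, the optimal update policy, has the closed form

$$
\pi^{*}(a \mid s) = \frac{\pi_{\theta_k}(a \mid s)}{Z_{\lambda,\nu}(s)} \exp\!\Big(\frac{1}{\lambda}\big(A^{\pi_{\theta_k}}(s,a) - \nu\, A_C^{\pi_{\theta_k}}(s,a)\big)\Big)
$$

Step 2, the projection back into the parametric policy space:

$$
\theta_{k+1} = \arg\min_{\theta}\; \mathbb{E}_{s \sim d^{\pi_{\theta_k}}}\big[D_{\mathrm{KL}}(\pi_{\theta} \,\|\, \pi^{*})[s]\big]
$$

Because the projection objective can be minimized with ordinary gradient steps, no second-order information is needed, which is the sense in which such a method is first-order.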

name change, order constrained optimization, policy space, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.60)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2

Adamyan, Alen, Čížek, Tomáš, Straka, Matej, Janouskova, Klara, Schmid, Martin

arXiv.org Artificial Intelligence | Jul-14-2025

Segment Anything Model 2 (SAM 2) has demonstrated strong performance in object segmentation tasks and has become the state of the art for visual object tracking. The model stores information from previous frames in a memory bank, enabling temporal consistency across video sequences. Recent methods augment SAM 2 with hand-crafted update rules to better handle distractors, occlusions, and object motion. We propose a fundamentally different approach using reinforcement learning for optimizing memory updates in SAM 2 by framing memory control as a sequential decision-making problem. In an overfitting setup with a separate agent per video, our method achieves a relative improvement over SAM 2 that is more than three times the gains of existing heuristics. These results reveal the untapped potential of the memory bank and highlight reinforcement learning as a powerful alternative to hand-crafted update rules for memory control in visual object tracking.

machine learning, reinforcement learning, sam 2, (16 more...)

arXiv.org Artificial Intelligence

2507.08548

Country: Europe > Czechia (0.15)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multi-Agent First Order Constrained Optimization in Policy Space

Neural Information Processing Systems | Jan-19-2025, 09:58:31 GMT

multi-agent system, order constrained optimization, policy space, (3 more...)

Neural Information Processing Systems

Genre: Research Report (0.43)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Agent Heterogeneity Mediates Extremism in an Adaptive Social Network Model

Bullock, Seth, Sayama, Hiroki

arXiv.org Artificial Intelligence | May-17-2023

An existing model of opinion dynamics on an adaptive social network is extended to introduce update policy heterogeneity, representing the fact that individual differences between social animals can affect their tendency to form, and be influenced by, their social bonds with other animals. As in the original model, the opinions and social connections of a population of model agents change due to three social processes: conformity, homophily and neophily. Here, however, we explore the case in which each node's susceptibility to these three processes is parameterised by node-specific values drawn independently at random from some distribution. This introduction of heterogeneity increases both the degree of extremism and connectedness in the final population (relative to comparable homogeneous networks) and leads to significant assortativity with respect to node update policy parameters as well as node opinions. Each node's update policy parameters also predict properties of the community that they will belong to in the final network configuration. These results suggest that update policy heterogeneity in social populations may have a significant impact on the formation of extremist communities in real-world populations.

artificial intelligence, machine learning, node, (16 more...)

arXiv.org Artificial Intelligence

2305.1023

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > Broome County > Binghamton (0.04)
South America > Venezuela (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.69)
Information Technology > Services (0.62)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Progressive Feature Upgrade in Semi-supervised Learning on Tabular Domain

Gharasuie, Morteza Mohammady, Wang, Fenjiao

arXiv.org Artificial Intelligence | Dec-1-2022

Recent semi-supervised and self-supervised methods have shown great success in the image and text domains by utilizing augmentation techniques. This success does not transfer easily to tabular domains: domain-specific transformations from image and language are hard to adapt to tabular data because it mixes different data types (continuous and categorical). A few semi-supervised works on the tabular domain have focused on proposing new augmentation techniques for tabular data. These approaches may have shown some improvement on datasets with low-cardinality categorical data. However, the fundamental challenges have not been tackled. The proposed methods either do not apply to datasets with high cardinality or do not use an efficient encoding of categorical data. We propose using a conditional probability representation and an efficient progressive feature-upgrading framework to effectively learn representations for tabular data in semi-supervised applications. Extensive experiments show the superior performance of the proposed framework and its potential application in semi-supervised settings.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2212.00892

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Simplified: Off-Policy vs On-Policy in Reinforcement Learning

#artificialintelligence | Sep-12-2021, 11:55:30 GMT

Early on when learning Reinforcement Learning, you may encounter a distinction between algorithms: some are on-policy, some are off-policy. You may read many explanations but still ask the question: what the hell is the difference? Let's try to clarify this concept once and for all. I believe that the best way to do this is by example. So let's set up a simple environment.
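
To make the distinction concrete, here is a minimal sketch that is not taken from the linked article. It contrasts the bootstrapped targets of two textbook tabular algorithms, SARSA (on-policy) and Q-learning (off-policy). The function names, the list-of-lists Q-table, and the numbers in the usage lines are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q, state, epsilon, rng):
    """Behaviour policy: greedy most of the time, random with probability epsilon."""
    n_actions = len(q[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[state][a])


def sarsa_target(q, reward, next_state, next_action, gamma):
    # On-policy: bootstrap from the action the behaviour policy actually takes next.
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma):
    # Off-policy: bootstrap from the greedy action, whatever is actually taken next.
    return reward + gamma * max(q[next_state])


def td_update(q, state, action, target, alpha):
    # Shared temporal-difference step; only the target differs between the two.
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical two-state, two-action Q-table.
q = [[0.0, 0.0], [1.0, 5.0]]
exploratory_action = 0  # suppose the behaviour policy explored in state 1
on_policy = sarsa_target(q, reward=0.0, next_state=1,
                         next_action=exploratory_action, gamma=0.9)
off_policy = q_learning_target(q, reward=0.0, next_state=1, gamma=0.9)
```

When the next action is exploratory, the SARSA target follows that action's value while the Q-learning target ignores it and uses the best available value. When the next action happens to be greedy, the two targets coincide.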

algorithm, reinforcement learning, update policy, (6 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.75)

Differentially Private Set Union

Gopi, Sivakanth, Gulhane, Pankaj, Kulkarni, Janardhan, Shen, Judy Hanwen, Shokouhi, Milad, Yekhanin, Sergey

arXiv.org Machine Learning | Feb-22-2020

We study the basic operation of set union in the global model of differential privacy. In this problem, we are given a universe $U$ of items, possibly of infinite size, and a database $D$ of users. Each user $i$ contributes a subset $W_i \subseteq U$ of items. We want an ($\epsilon$,$\delta$)-differentially private algorithm which outputs a subset $S \subset \cup_i W_i$ such that the size of $S$ is as large as possible. The problem arises in countless real-world applications; it is particularly ubiquitous in natural language processing (NLP) applications as vocabulary extraction. For example, discovering words, sentences, $n$-grams, etc., from private text data belonging to users is an instance of the set union problem. Known algorithms for this problem proceed by collecting a subset of items from each user, taking the union of such subsets, and disclosing the items whose noisy counts fall above a certain threshold. Crucially, in the above process, the contribution of each individual user is always independent of the items held by other users, resulting in a wasteful aggregation process, where some item counts happen to be way above the threshold. We deviate from the above paradigm by allowing users to contribute their items in a $\textit{dependent fashion}$, guided by a $\textit{policy}$. In this new setting, ensuring privacy is significantly more delicate. We prove that any policy which has certain $\textit{contractive}$ properties would result in a differentially private algorithm. We design two new algorithms, one using Laplace noise and the other Gaussian noise, as specific instances of policies satisfying the contractive properties. Our experiments show that the new algorithms significantly outperform previously known mechanisms for the problem.
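
The baseline the abstract describes (per-user subsets, union, noisy counts, threshold) can be sketched as below. This is only an illustration of that baseline, not the paper's policy-based algorithms. The function and parameter names are hypothetical, and `noise_scale` and `threshold` are taken as given inputs: the values needed for an actual ($\epsilon$,$\delta$) guarantee depend on $\epsilon$, $\delta$ and the per-user cap, and are not derived here.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def baseline_private_union(user_items, max_items, noise_scale, threshold, seed=0):
    """Count-and-threshold baseline; privacy parameters are assumed inputs."""
    rng = random.Random(seed)
    counts = Counter()
    for items in user_items:
        # Each user contributes at most `max_items` distinct items, chosen
        # independently of what any other user holds.
        distinct = sorted(set(items))
        chosen = rng.sample(distinct, min(max_items, len(distinct)))
        counts.update(chosen)
    # Disclose only the items whose noisy count is above the threshold.
    return {
        item
        for item, count in counts.items()
        if count + laplace_noise(noise_scale, rng) > threshold
    }


# Hypothetical data: every user holds "the" plus one item unique to them.
users = [["the", f"rare_{i}"] for i in range(200)]
released = baseline_private_union(users, max_items=2, noise_scale=1.0, threshold=50.0)
```

In this toy run the shared item has a count of 200, far above the threshold, which is the "wasteful" surplus the abstract refers to: contributions spent on an item that would have been released anyway could have helped other items cross the threshold.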

algorithm, histogram, update policy, (15 more...)

arXiv.org Machine Learning

2002.09745

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
